Non-destructive test-based assessment of uniaxial compressive strength and elasticity modulus of intact carbonate rocks using stacking ensemble models

The uniaxial compressive strength (UCS) and elasticity modulus (E) of intact rock are two fundamental parameters in engineering applications. They can be measured directly by the uniaxial compressive strength test or estimated indirectly using soft computing predictive models. In the present research, the UCS and E of intact carbonate rocks are predicted from the results of simple non-destructive laboratory tests by two stacking ensemble learning models. For this purpose, dry unit weight, porosity, P-wave velocity, Brinell surface hardness, UCS, and static E were measured for 70 carbonate rock samples. Then, two stacking ensemble learning models were developed for estimating the UCS and E of the rocks. The stacking ensemble integrates the advantages of two base models in the first level: multi-layer perceptron (MLP) and random forest (RF) for predicting UCS, and support vector regressor (SVR) and extreme gradient boosting (XGBoost) for predicting E. Grid search combined with k-fold cross-validation is applied to tune the parameters of both the base models and the meta-learner. The results demonstrate that the stacking ensemble generalizes better than its base models in terms of common performance measures. The coefficient of determination (R2) obtained from the stacking ensemble is 0.909 for UCS and 0.831 for E, and the corresponding Root Mean Squared Error (RMSE) values are 1.967 and 0.621. Accordingly, the proposed models outperform SVR and MLP as single models and RF and XGBoost as two representative ensemble models. Furthermore, a sensitivity analysis is carried out to investigate the impact of the input parameters.


Introduction
The uniaxial compressive strength (UCS) and elasticity modulus (E) of intact rock are two fundamental requirements for evaluating the strength, stability, and deformation behavior of rock in engineering projects. Both parameters are obtained from the uniaxial compressive strength test, the most common and most important laboratory test in rock mechanics. The UCS is a main input parameter in intact rock and rock mass classifications as well as in failure criteria, and E characterizes the stiffness and deformability of geomaterials, especially rocks. Direct measurement of the UCS and E in accordance with recommended standards such as ISRM (International Society for Rock Mechanics) and ASTM (American Society for Testing and Materials) is difficult, time-consuming, and expensive, and it can even be impossible in weak, highly fractured, inherently anisotropic, highly foliated, and stratified or laminated rocks because suitable core specimens cannot be prepared. For this reason, predictive approaches are often applied for the indirect estimation of the UCS and E. Recently, various machine learning and soft computing approaches have been used to predict these two parameters from the results of simple laboratory tests. In this regard, neural networks, support vector regression, random forest, extreme gradient boosting, multi-layer perceptron, fuzzy systems, evolutionary algorithms, etc. are common predictive approaches, and rock properties such as unit weight, porosity, P-wave velocity, slake durability index, and rock surface hardness are applicable predictors [1][2][3][4][5][6][7][8]. Unlike traditional statistical methods such as simple and multiple regressions, which must be applied to similar rocks, machine learning and soft computing approaches are sufficiently generic to be used for various rock types because they can relate all value ranges of the UCS and E to other rock properties. They are therefore well suited to general applications and reliable for all rock types.
In the last few years, several researchers have applied machine learning and soft computing-based techniques to estimate both the UCS and E of various rock types [e.g. 9-23]. In this regard, Ghasemi et al. [24] applied model trees as the predictive approach, with Schmidt hardness, effective porosity, dry unit weight, P-wave velocity, and slake durability index as input variables for predicting the UCS and E. They found that pruned and unpruned model trees provide suitable predictions of the parameters. Beiki et al. [18] predicted the UCS and E of carbonate rocks using genetic programming and multiple nonlinear regression models and found that genetic programming fitted the data more accurately than the regression models, making it the more useful technique for estimating the UCS and E. Madhubabu et al. [25] and Aboutaleb et al. [19] used MLR, ANN, and SVR for predicting both the UCS and E of carbonate rocks, together with R2 and RMSE to examine the accuracy of the results. Their studies revealed that the ANN and SVR have better predictive efficiency than the MLR for predicting the UCS and E from physical and index characteristics of the rocks. In another study, Rezaei and Asadizadeh [20] combined intelligent techniques including ANFIS, genetic algorithm (GA), and particle swarm optimization (PSO) to predict the UCS of very strong rock types. Their study showed that the combined methods have higher capability than the regression model. They also found that the density and Schmidt rebound hardness were more strongly related to the UCS than the rock porosity. Khan et al. [26] applied MLR, ANN, RF, and KNN for predicting the UCS and static E of marble under different thermal conditions from its physical, chemical, and mechanical properties, namely density, porosity, P-wave velocity, and dynamic E. They found that the KNN and RF are reliable approaches for predicting both the UCS and E.
Also, P-wave velocity was found to correlate strongly with the UCS and E. Based on predictive performance, the RF model was proposed as the best model for predicting the UCS and E. Shahani et al. [21,27], in a comprehensive study, measured the UCS, E, dry and wet densities, and Brazilian tensile strength of soft sedimentary rocks and predicted the UCS and E from the other rock parameters by employing MLR, ANN, and ANFIS models. Their research indicated that these approaches are suitable ways to predict the UCS and E, and that the ANFIS is the most accurate of the employed models.
karimizohre/SERock. The code is written in Python and run in a Jupyter notebook.
Funding: The author(s) received no specific funding for this work.

Competing interests:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rukhaiyar and Samadhiya [28] developed a polyaxial strength model for intact sandstone based on an artificial neural network (ANN) to determine the influence of each independent parameter, namely the uniaxial compressive strength (UCS), minor principal stress (σ3), and intermediate principal stress (σ2), on the strength of sandstone, i.e., the major principal stress at failure (σ1). They found that the ANN-based failure model gives the best result among all the considered polyaxial strength criteria for the testing dataset. Behzadafshar et al. [29] proposed a new ANN model to approximate the elasticity modulus (E) of granite rock samples based on laboratory test results. In that research, rock index tests including point load, P-wave velocity, and Schmidt hammer, together with uniaxial compressive strength (UCS) tests, were carried out to prepare a database of 62 datasets for the analysis. Based on the sensitivity analysis of the developed ANN model, P-wave velocity has the greatest effect on the E of the rock samples.

It can be easily understood from the mentioned studies that 1) statistical and soft computing approaches are suitable methods for predicting engineering characteristics of rocks such as the UCS and E; 2) machine learning and soft computing approaches provide more accurate results than statistical methods such as simple and multiple regressions; and 3) combining several models in a proper way achieves a better result than using a single model.
The purpose of this paper is to create an ensemble machine learning method for estimating the UCS and E, building on existing research showing that combining certain machine learning methods can improve their performance. However, the commonly used combination frameworks are typically boosting and bagging. Boosting is vulnerable to overfitting, and neither framework utilizes all the available data for learning the base models. This paper seeks to utilize the knowledge contained within all the data by developing a stacking ensemble model to improve prediction accuracy. Few studies have applied stacking to the prediction of rock properties; Koopialipoor et al. [30] developed a stacking structure employing the MLP, KNN, and RF for predicting E from other rock properties, namely porosity, Schmidt rebound hardness, P-wave velocity, and point load index. In the present research, therefore, an attempt has been made to examine a stacking ensemble learning method for predicting the UCS and E of travertines and limestones, two major categories of carbonate rocks, which are among the most abundant and common rock types at the earth's surface and are encountered in many engineering projects worldwide. The contributions of this study are as follows: 1) the engineering properties of 70 carbonate rock samples, including 30 travertines and 40 limestones, are assessed; 2) two stacking ensemble models are developed to predict E and the UCS by combining several base models, with the parameters of the models tuned by grid search; 3) a sensitivity analysis is performed to assess the relative importance of the input variables for E and the UCS; and 4) extensive numerical experiments are conducted to compare the effectiveness of the proposed method with some popular learning methods. The results confirm that the proposed method outperforms Support Vector Regressor (SVR), Random Forest (RF), extreme gradient boosting (XGBoost), and Multi-Layer Perceptron (MLP).

Materials and experimental studies
The rock materials and their characteristics were obtained in three steps, namely sample selection and specimen preparation, experimental procedures and rock characteristics (laboratory investigations), and desk studies. The methodology flowchart of the research is presented in Fig 1. The first step includes selecting suitable carbonate building stones that are extracted from quarries and used in many cities of Iran. Rock specimens with suitable shapes and dimensions were then prepared for the considered laboratory tests. The second step includes a comprehensive laboratory test program for evaluating the engineering characteristics and behavior of the selected samples. In the third step, data analysis was performed and the uniaxial compressive strength and modulus of elasticity were predicted using stacking ensemble models.

Sample selection and specimen preparation
A total of 70 carbonate rock samples, including 30 travertines and 40 limestones, were taken from quarries in different regions of Iran and transported to the engineering geology laboratory. They vary in apparent properties such as color, luster, and surface texture. In hand specimens, the travertines are light brown to gray in color, without any veins at the surface, and with cavities filled by light brown to black materials. The limestones are white to cream or gray in color, with small, thin, light to dark veins, and without cavities at their surfaces. In the specimen preparation step, cube-shaped specimens with dimensions of 54×54×135 mm were prepared from the selected samples in a stone cutting workshop before the laboratory tests. The prepared specimens were washed with water to remove dust deposited on the stone surfaces during cutting, dried in an oven at 105˚C, and weighed on a scale with an accuracy of 0.01 g. Finally, they were tested in dry condition in accordance with ASTM [31] and ISRM [32]. The necessary thin sections were also prepared to investigate the petrographic properties of the rock samples.

Experimental procedures and rock characteristics
Mineralogical and petrographic studies, physical property tests (dry unit weight and effective porosity), the ultrasonic wave velocity test, and the Brinell hardness test were carried out to determine the considered characteristics. Microscopic studies on thin sections were done in accordance with ASTM [33] and ISRM [32] to identify the mineral content, texture, and petrographic properties of the rock samples. Fig 2 shows four microscopic images of representative tested rocks in cross-polarized light (XPL). The travertine samples consist of micrite and sparry calcite. The sparry calcite crystals are grown on the internal walls of cavities, and some of the cavities are completely filled. The limestone samples consist of sparry calcite or micrite. In the samples composed of micrite, sparry calcite crystals are grown on the internal walls of cavities and some of the cavities are completely filled. There are fractures and veins filled by calcite crystals and iron dioxide. Some samples contain fossils.
Physical property tests were carried out on the prepared specimens using the regular shape method to determine the dry unit weight (γd) and effective porosity (ne) in accordance with ASTM [31] and ISRM [32]. Four specimens were tested for each rock sample and the four results were averaged. Tests that gave erroneous results were repeated to reduce the test error. A total of 280 specimens were therefore tested in this step to determine the average values of γd and ne. The ultrasonic wave velocity (P-wave velocity) of the selected rock samples was determined in the laboratory using a Proceq digital ultrasonic tester (Model: ND 180; Trade name: Pundit Lab+; Transit time range: 0.1-9999 μs; Energizing pulse: 125, 250, 350, 500 V; Frequency range: 24-500 kHz) in accordance with ASTM [34] and ISRM [32]. The Brinell hardness test was performed based on ASTM [35]. This test method is generally used for materials with a coarse structure and rough surface. A constant load (F) is applied to a carbide ball for a predetermined period of time. The impression left by the ball is then measured with an optical system along two perpendicular diameters. The Brinell hardness number is calculated as:
$$\mathrm{BHN} = \frac{2F}{\pi D_b\left(D_b - \sqrt{D_b^2 - D_i^2}\right)} \tag{1}$$
where BHN is the Brinell hardness number in kgf/mm2, F is the applied load in kgf (F = 1000 kgf), $D_b$ is the diameter of the indenter ball ($D_b$ = 10 mm), and $D_i$ is the diameter of the impression (mm). The uniaxial compressive strength test is described as a suggested method for determining the UCS and E by ISRM [36] and ASTM-D-2938 [37]. In the current research, four prepared specimens from each sample were tested to determine the UCS and E. The two parameters are calculated from the following equations:
$$\mathrm{UCS} = \frac{4F}{\pi D^2} \tag{2}$$
$$E = \frac{\Delta\sigma}{\Delta\varepsilon_a} \tag{3}$$
In these equations, UCS is the uniaxial compressive strength (MPa), F is the maximum force applied to the tested specimen (N), D is the diameter of the tested cylindrical specimen (mm), E is the secant modulus of elasticity (MPa), Δσ is the change of stress applied to the tested specimen (MPa), and $\Delta\varepsilon_a$ is the change of axial strain of the specimen during the test. [35].
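As a worked illustration of the three laboratory quantities defined in this section, the short sketch below evaluates the Brinell hardness number, the UCS, and the secant modulus. It assumes the standard Brinell relation and that the UCS is the peak force divided by the circular cross-sectional area; the function names and the input readings are hypothetical and are not measurements from this study.

```python
import math


def brinell_hardness(load_kgf, ball_diameter_mm, impression_diameter_mm):
    """Brinell hardness number (kgf/mm^2) from load and impression diameter."""
    d_b, d_i = ball_diameter_mm, impression_diameter_mm
    return 2.0 * load_kgf / (math.pi * d_b * (d_b - math.sqrt(d_b**2 - d_i**2)))


def ucs_mpa(max_force_n, diameter_mm):
    """UCS (MPa): peak force (N) over the circular cross-section (mm^2)."""
    return 4.0 * max_force_n / (math.pi * diameter_mm**2)


def secant_modulus_mpa(delta_stress_mpa, delta_axial_strain):
    """Secant modulus (MPa): stress change over axial strain change."""
    return delta_stress_mpa / delta_axial_strain


# Hypothetical readings chosen only to exercise the formulas.
bhn = brinell_hardness(load_kgf=1000.0, ball_diameter_mm=10.0,
                       impression_diameter_mm=3.0)
ucs = ucs_mpa(max_force_n=100_000.0, diameter_mm=54.0)
e_mod = secant_modulus_mpa(delta_stress_mpa=20.0, delta_axial_strain=0.001)

print(f"BHN = {bhn:.1f} kgf/mm^2, UCS = {ucs:.1f} MPa, E = {e_mod / 1000:.1f} GPa")
```

Because N/mm2 equals MPa, no unit conversion is needed for the UCS; the modulus is divided by 1000 only to report it in GPa.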
After the required engineering characteristics of the rocks were calculated, distribution curves and histograms of the rock properties were prepared; they are presented in Fig 3, where the Ishikawa formula is used for computing the number of bins [39].
The descriptive statistics of the data are also given in Table 2. Pearson's product-moment correlation coefficient indicates the strength of the linear relationship between the independent and dependent variables [40]. It is computed and is given in

Machine learning algorithms
Ensemble methods aggregate several models to enhance their generalizability and robustness. Bagging and boosting are two common ensemble methods that have been applied to rock property prediction in the literature [27,41,42], while the stacking ensemble, a promising ensemble method, has received less attention in related work. Compared to bagging and boosting, the stacking ensemble method has three important characteristics: 1) it fully utilizes the training data, so the model can learn from all samples; 2) training in its first level is done by cross-validation, so the trained model is robust and overfitting to the training data is avoided; and 3) it integrates different types of base learners and therefore combines their advantages and compensates for their disadvantages. The last point is particularly important because finding a single suitable model for various datasets is difficult. This section describes the stacking ensemble and the base models used in this research.

Stacking ensemble learning
The stacking ensemble learning model was introduced by Wolpert [43]. Recently, it has been successfully applied in various applications [44][45][46][47][48][49][50]. It combines some models together in
To optimize the hyperparameters of the base models, the grid-search algorithm integrated with k-fold cross-validation is applied. First, the value ranges of the hyperparameters are set, and the model is then trained for all combinations of hyperparameter values. Each combination is evaluated by computing the performance measure in a k-fold cross-validation setting. The best combination is selected to train the corresponding base model on all the training data. These steps are shown in Fig 7.
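The grid-search procedure described above can be sketched with scikit-learn's `GridSearchCV`. This is a minimal illustration, not the authors' code: the data are synthetic stand-ins for four scaled inputs, the parameter ranges are illustrative rather than those of the study, and the choice of five folds is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the training set: 56 samples, 4 scaled inputs.
rng = np.random.default_rng(0)
X = rng.random((56, 4))
y = 40 * X[:, 2] - 15 * X[:, 1] + rng.normal(0, 1, 56)

# Illustrative value ranges; every combination is tried.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 6]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # higher is better, so MSE is negated
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    refit=True,  # retrain the best combination on all training data
)
search.fit(X, y)

best_params = search.best_params_
best_cv_mse = -search.best_score_  # mean MSE over the validation folds
print(best_params, round(best_cv_mse, 3))
```

With `refit=True`, `search.best_estimator_` is the base model retrained on the full training set, which corresponds to the last step of the procedure.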
After the base models are trained on the original data set, the meta-learner is trained on the synthetic data set. The details of the stacking ensemble are given in Fig 8. To achieve a suitable stacking ensemble model, the base models should be diverse and have high performance. In this research, we design two stacking ensembles: one for predicting the UCS and another for estimating the static E of carbonate rocks. Four regressors are studied as base models, two for each stacking ensemble. The base models are SVR, RF, XGBoost, and MLP, since they have been successfully applied to rock property prediction [51]. SVR is a suitable method for dealing with nonlinear problems and has achieved high-quality results when the available data are scarce [52]. XGBoost has outperformed other machine learning methods in many challenges [53]. MLP is a well-known and powerful learning method for handling challenges in rock data, and RF has frequently been applied to rock property prediction in the literature [41,54,55,56]. These methods are briefly described in the next sections.
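A two-level stacking ensemble of this kind can be sketched with scikit-learn's `StackingRegressor`, which trains the meta-learner on cross-validated predictions of the base models. The example below pairs an RF and an MLP with an MLP meta-learner, as in the UCS ensemble; the data are synthetic, and the tree count, the meta-learner size, and the five-fold setting are assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for 70 samples with 4 scaled inputs.
rng = np.random.default_rng(0)
X = rng.random((70, 4))
y = 40 * X[:, 2] - 15 * X[:, 1] + rng.normal(0, 1, 70)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# First level: two diverse base models (hyperparameters are illustrative).
base_models = [
    ("rf", RandomForestRegressor(n_estimators=100, max_depth=6, random_state=0)),
    ("mlp", MLPRegressor(hidden_layer_sizes=(14,), solver="lbfgs",
                         max_iter=2000, random_state=0)),
]
# Second level: meta-learner fitted on out-of-fold base predictions.
meta_learner = MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                            max_iter=2000, random_state=0)

stack = StackingRegressor(estimators=base_models,
                          final_estimator=meta_learner, cv=5)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)
print(y_pred[:3])
```

Because the meta-learner only sees predictions made on folds the base models were not trained on, every training sample contributes to both levels without leaking its own target into the second level.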

Support vector regression
The SVR is a prominent model for predicting a dependent variable based on statistical learning theory. It is based on the structural risk minimization principle and exploits the kernel trick. The regression function takes the form
$$f(x) = \langle \omega, \phi(x) \rangle + b \tag{4}$$
where b is the intercept, $\omega \in \mathbb{R}^{d_k}$ is the weight vector, $\phi(\cdot): \mathbb{R}^{D} \to \mathbb{R}^{d_k}$ is a mapping from the input space to a new high-dimensional space, and $d_k$ is the dimension of the feature space, which is implicitly defined. The notation $\langle \cdot, \cdot \rangle$ indicates the dot product [57]. A kernel function is commonly employed in SVR to transform the input data to a high-dimensional feature space in order to capture non-linearity in the data. The Radial Basis Function (RBF) is the most widely used kernel function; it computes the similarity of $x_i$ and $x_j$ as $\exp(-\gamma \lVert x_i - x_j \rVert^2)$, where γ is its parameter.
The optimization problem of SVR consists of a regularization term and a loss function, as follows:
$$\min_{\omega, b, \xi, \xi^*} \; \frac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - \langle \omega, \phi(x_i) \rangle - b \le \varepsilon + \xi_i, \;\; \langle \omega, \phi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^*, \;\; \xi_i, \xi_i^* \ge 0 \tag{5}$$
where n is the number of training samples, the $l_2$-norm and the ε-insensitive function are applied as the regularization term and the loss function, respectively, and C>0 controls the trade-off between the two terms. ξ and ξ* are slack variables introduced to tolerate deviations larger than ε in the training data. For more details on SVR, refer to Smola [57]. Various SVR variants have been successfully applied for predicting engineering properties of rocks [58,59].

Random forest
The random forest is an ensemble of classification and regression trees (CART) with suitable characteristics including 1) supporting nonlinearity, 2) being non-parametric, 3) fast training, 4) random subspace selection, and 5) resistance to overfitting. The base trees are built on randomly selected input features of random samples of the original data set, drawn in a bagging manner. This randomness increases the diversity of the base trees. To build each tree, the input space is recursively partitioned into two parts so that the data in each leaf are as pure as possible. In the regression task, purity is commonly defined by the mean squared error, where a regression model is learned from the data in each leaf. The predicted value in the regression task is computed by averaging the predictions of the base learners. The RF has recently been used in various applications, including the prediction of engineering properties of rocks [41,54,55]. The number of base trees and their maximum depth are its two important parameters.

XGBoost
The XGBoost is an effective, scalable method that combines trees in an iterative boosting manner. Like the RF, it learns several tree models, but it differs from the RF in the training details. The ith predicted value $\hat{y}_i$ is computed as
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F} \tag{6}$$
where $f_k$ is an independent tree, K is the number of trees, and $\mathcal{F} = \{ f(x) = \omega_{q(x)} \} \; (q: \mathbb{R}^{D} \to T, \; \omega \in \mathbb{R}^{T})$ indicates the space of regression trees. The trees are learned by minimizing the regularized objective
$$\mathcal{L} = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k), \quad \Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^2 \tag{7}$$
where l is a convex loss function that penalizes the deviation of the predicted value $\hat{y}_i$ from the target value $y_i$, and Ω is the regularization term that prevents overfitting and consists of two terms, the number of leaves, T, and the L2-norm of ω, which control the complexity of the model (γ and λ are the regularization coefficients). The optimization of Eq 7 is done in an additive manner. For more details, please refer to Chen [60]. Three important parameters of the XGBoost are the number of trees, the maximum tree depth, and the learning rate. Since the XGBoost learns each tree to correct the error of the existing sequence of trees, it is prone to overfitting. To prevent overfitting, a weight factor is assigned to the correction made by each new tree. This weight factor is the learning rate.

MLP
A neural network is a powerful model for representing both linear and non-linear relations between inputs and outputs. The MLP is one of the most common neural networks; it uses the back-propagation algorithm for learning its weights. It is a supervised, feedforward network.
The training phase of the MLP includes two steps: first, selecting the neural network architecture, and second, adjusting the connection weights. A typical MLP has an input layer and an output layer. It may have one or more hidden layers, which extract important features of the input data and are not directly accessible. Each layer contains a number of neurons, and an activation function is assigned to each neuron. The MLP can learn nonlinear patterns from data. Two well-known weight optimization methods for the MLP are L-BFGS (Limited-Memory Broyden-Fletcher-Goldfarb-Shanno) [61] and Adam (adaptive moment estimation) [62]. L-BFGS belongs to the class of quasi-Newton methods, and Adam is an efficient stochastic optimization method that needs only first-order gradients and has a small memory requirement.

Model development, results and discussions
This part explains the application of the stacking ensemble model for estimating the UCS and static E of carbonate rocks. Section 3.1 explains the settings used in our experiments, Section 3.2 describes the performance measures used for assessing the proposed methods, and Section 3.3 presents and discusses the results.

Settings
The dataset is split randomly into training and test sets, where 20% of the data is used for testing and 80% for training. The input data are normalized before the machine learning methods are applied, so that all features are scaled to the range [0,1]. This step is important for eliminating the effect of the differing ranges of the features. The normalization is done by the following relation:
$$z_i^d = \frac{x_i^d - x_{\min}^d}{x_{\max}^d - x_{\min}^d}, \quad d = 1, \dots, D \tag{8}$$
where $x_i^d$ is the value of the dth input variable of the ith data point, $x_{\min}^d$ and $x_{\max}^d$ are the minimum and maximum values of the dth input variable, $z_i^d$ is the scaled value of $x_i^d$, and D = 4 in our data, which equals the number of independent variables, namely γd, ne, vp, and BHN.
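The min-max scaling described above can be sketched in a few lines of plain Python. The sample rows are hypothetical values for the four inputs, and fitting the minimum and maximum on the training rows only is an assumption made here; the text does not state which subset the extremes are taken from.

```python
def fit_min_max(rows):
    """Return per-feature minima and maxima of a list of feature rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_scale(rows, mins, maxs):
    """Scale each feature to [0, 1]; a constant feature maps to 0.0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


# Hypothetical training rows: unit weight, porosity, P-wave velocity, BHN.
train = [
    [24.1, 5.2, 4.8, 95.0],
    [26.3, 1.1, 6.1, 140.0],
    [25.0, 3.0, 5.5, 120.0],
]
mins, maxs = fit_min_max(train)
scaled = min_max_scale(train, mins, maxs)
print(scaled)
```

Test rows would be scaled with the same `mins` and `maxs`, so values outside the training range can fall slightly outside [0, 1].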

Performance measures
In this paper, the performance of the estimation is assessed in terms of the coefficient of determination (R2), Root Mean Squared Error (RMSE), Mean Squared Error (MSE), Mean Absolute Error (MAE), Variance Accounted For (VAF), Index of Scatter (IOS), index of agreement (IOA), Mean Absolute Percentage Error (MAPE), Weighted Mean Absolute Percentage Error (WMAPE), Performance Index (PI), and a20-index. These metrics are frequently employed to assess regression problems [63,64]. RMSE, MSE, MAE, MAPE, WMAPE, and IOS measure the prediction error. R2 specifies the appropriateness of the fitted model and lies in the range [0,1]; larger R2 values and smaller MSE, MAE, RMSE, MAPE, WMAPE, and IOS values indicate better performance. The a20-index is the proportion of samples whose observed values lie within a ±20% deviation of the predicted values, the number of such samples being denoted m20. A higher a20-index indicates better predictive accuracy. The RMSE, for instance, is given by
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
where n is the number of samples, $y_i$ is the observed value, and $\hat{y}_i$ is the predicted value.
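Several of the measures named above can be computed directly from the observed and predicted values. The sketch below uses common textbook definitions, which are assumed here rather than taken from the paper: R2 as one minus the ratio of residual to total sum of squares, MAPE as a fraction rather than a percentage, and the a20-index as the share of samples whose observed-to-predicted ratio lies between 0.8 and 1.2. The input values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, R2, MAPE (fraction), and a20-index."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    # m20: samples whose observed/predicted ratio is within 0.8 to 1.2
    m20 = sum(1 for t, p in zip(y_true, y_pred) if 0.8 <= t / p <= 1.2)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
        "MAPE": sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
        "a20": m20 / n,
    }


# Hypothetical observed and predicted values.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

VAF, IOS, IOA, WMAPE, and PI are left out because their definitions vary between papers and are not recoverable from the text.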

Prediction performances
a. Tuning the hyperparameters of the base models. In the first stage, we train several state-of-the-art machine learning methods, namely XGBoost, RF, SVR, and MLP. Grid search combined with k-fold cross-validation is applied to optimize the parameters of all base models. The mean MSE is computed for each parameter configuration of the grid search, and the best result on the validation data indicates the best parameter setting. The optimized parameters and their ranges are given in Table 3.
An MLP model with one hidden layer is also considered as a base model in our experiments. Adding hidden layers may reduce the error of the model, but it also increases both the network complexity and the training time and can lead to overfitting. The influence of the number of hidden-layer neurons and of the weight optimization method is illustrated in Fig 9. As this figure shows, the number of neurons has a stronger effect when estimating the UCS than when estimating E, and for the UCS the number of neurons and the weight optimization method were set to 14 and L-BFGS, respectively. For predicting E, the Adam optimizer yields more stable results than L-BFGS. With this optimizer, the number of neurons does not have much impact on the MSE; 17 neurons were selected in our experiments.
For the random forest base model, the effect of the parameters is shown in Fig 10. Based on the figure, the number of estimators has little effect on the MSE for predicting E, and overfitting does not occur when the maximum tree depth is less than 3. The best configuration for predicting E is 200 trees with a maximum depth of 3. A similar pattern holds for predicting the UCS, where 400 trees with a maximum depth of 6 were found by grid search. The impact of the SVR parameters on the MSE is illustrated in Fig 11. In both parts of this figure, two parameters are fixed at their best values and the mean negative mean squared error is plotted against the remaining parameter. As shown, the SVR is sensitive to all parameters. C = 10 and γ = 1 are the best parameter settings for predicting both the UCS and E. The value of ε is set to 1 and 0.1 for predicting the UCS and E, respectively.
The influence of the XGBoost parameters was also assessed, and the results are given in Fig 12. As with the SVR parameters, one parameter value is varied at a time while the other parameters are set to their best values. The best values of the learning rate, maximum depth, and number of trees for predicting E were 0.3, 5, and 400, respectively. Increasing the value of any parameter leads to a higher prediction error; however, this effect is strongest for the number of predictors. Increasing the number of predictors increases the model complexity and causes overfitting. A similar trend is observed in predicting the UCS, where the best values are 0.01, 10, and 800 for the learning rate, maximum depth, and number of tree predictors, respectively.
b. Testing the stacking ensemble model. We constructed two stacking ensembles: one for predicting E, using the SVR and XGBoost as base models, and another for predicting the UCS, using the RF and MLP as the first-stage models. The meta-learners are the SVR and MLP for predicting E and the UCS, respectively.
After the hyperparameters of the base models were tuned, the hyperparameters of the meta-learners were also tuned by grid search, and the two stacking ensemble models were then trained for predicting E and the UCS. The ranges of the meta-learner parameters also follow the values given in
The performance results obtained for predicting the UCS by the four base models and the proposed stacking ensemble are listed in Table 4. The suitability of the proposed model was confirmed with MAE, MSE, RMSE, R2, IOA, and IOS. The results confirm that the stacking ensemble is superior to all base models.
Table 5 gives the performance metrics obtained on the testing data by the developed stacking ensemble and the four base learners applied for predicting E. It is clear from the obtained results
For further analysis, the training times of the studied methods are also compared in Table 6. The reported training times were obtained on a workstation equipped with an Intel(R) Xeon(R) W-2150B CPU operating at 3 GHz and 64 GB of RAM.

Sensitivity analysis
Sensitivity analysis assesses the influence of the input parameters on the predicted output parameter(s). It serves to identify the parameter that exerts the greatest influence on the prediction. Moreover, it can be used to construct better AI models by discarding inconsequential input parameters. Sensitivity analyses may be either linear or nonlinear. Numerous scholars have employed the cosine amplitude method (CAM) for sensitivity analysis [64,65]. CAM determines the sensitivity of the input variables using the equation below:
$$R_{ij} = \frac{\sum_{k=1}^{m} x_{ik} x_{jk}}{\sqrt{\sum_{k=1}^{m} x_{ik}^2 \sum_{k=1}^{m} x_{jk}^2}}$$
where $R_{ij}$ is the strength of the relation between $X_i$ and $X_j$, $X_i$ is the ith input, $X_j$ is the jth output, and m is the number of samples. The computed CAM values for the four input variables with the UCS (MPa) and E (GPa) are illustrated in Fig 18. As shown, ne has the lowest influence on both outputs, and the other three input parameters have a relatively high impact.
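The cosine amplitude measure is simple to compute for one input-output pair. The sketch below assumes the usual CAM form, the sum of pairwise products divided by the square root of the product of the two sums of squares; the function name and the porosity and UCS values are hypothetical and serve only to show the call.

```python
import math


def cosine_amplitude(x, y):
    """CAM strength of relation between two equally long data series."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator


# Hypothetical porosity (%) and UCS (MPa) values for four samples.
porosity = [5.2, 1.1, 3.0, 8.4]
ucs = [45.0, 70.0, 58.0, 30.0]
strength = cosine_amplitude(porosity, ucs)
print(round(strength, 3))
```

For non-negative data the value lies between 0 and 1, and it equals 1 when one series is an exact positive multiple of the other, so values closer to 1 indicate a stronger relation between the input and the output.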

Discussion
Despite the high generalization ability and good accuracy of the base models studied in this research, applying them to the prediction of rock properties has some shortcomings. The instances may be highly nonlinear, and the SVR applies the kernel function only once to transform the sample data to the high-dimensional feature space, while a single mapping cannot guarantee finding the optimal feature space. Using an MLP in only one level can lead to getting stuck in a local optimum. The RF and XGBoost are based on bagging and boosting, respectively, and suffer from the limitations of these frameworks. Both are ensembles of trees and do not exploit the advantages of other models.
In the current study, it has been verified that the combination of certain models within the stacking ensemble framework can enhance the achieved results according to various metrics.The proposed approach exhibits significant improvements in RMSE, MSE, MAE, VAF, IOS, MAPE, WMAPE, PI, and a20index when estimating E. This suggests that the proposed method can serve as a viable alternative to other methods for estimating E, as it exhibits low error and high explanatory power in relation to the variance in E.Moreover, the stacking ensemble method also enhances the measures of RMSE, MSE, MAE, and IOS when predicting UCS.This indicates that the proposed method effectively captures the variability in UCS and produces predictions that are closer to the true values compared to the base models.However, it is worth noting that the proposed method does not yield satisfactory results in terms of VAF, MAPE, PI, WMAPE, and a20-index when predicting UCS.These metrics indicate that the proposed method explains less variance and has wider prediction intervals compared to other models.Therefore, further research is needed to address this limitation and explore the impact of marginal data on the estimation of UCS.
Additionally, the comparison of training time between the proposed method and the other methods supports its suitability from a time perspective. The proposed method is an ensemble method, which logically results in longer training times than single methods such as SVR and MLP. Compared with XGBoost, the proposed method trains significantly faster. RF, on the other hand, exhibits the shortest training time because it is composed of regression trees that can be learned quickly. Furthermore, learning the first-level models in parallel can help reduce the overall training time.

Conclusions
The most important results and findings of the research can be summarized as follows:
1) The laboratory experiments on the selected carbonate rocks indicated that they are moderate in strength with low values of E. They have high surface hardness based on the Brinell hardness test. These rocks have moderate to high γ_d, very low to moderate n_e, and moderate to very high v_p, depending on their constituent materials and the presence of cavities, veins, and fractures in their textures and structures.
2) The rocks were found to be well suited to the prediction of their engineering properties by machine learning and soft computing approaches. Therefore, two stacking ensemble methods have been developed for predicting their UCS and E from four simple non-destructive laboratory test results, namely γ_d, n_e, v_p, and HBN, where two base models were considered in the first level of each stacking ensemble.
3) The performance of the developed methods is confirmed in terms of MAE, MSE, RMSE, and R². These measures were calculated as 1.657, 3.867, 1.967, and 0.909 for predicting UCS, and 0.483, 0.386, 0.621, and 0.831 for predicting E, respectively. The values of 84.447, 0.130, 0.870, 0.786, 0.115, 1.000, and 0.029 for VAF, MAPE, PI, a20-index, WMAPE, IOA, and IOS further validate the superiority of the proposed method over the base methods and over the bagging and boosting methods.
4) The obtained results confirm that the stacking ensemble reduces the prediction error in comparison with the SVR and MLP as two popular single models. This is because the fitting ability of individual models can be enhanced by combining them.
5) Further, our experiments confirm that the stacking ensemble results are superior to those of RF and XGBoost, two widely used and successful ensemble models based on the well-known ensemble methods of bagging and boosting. This is because the stacking ensemble combines various high-quality models and can correct their errors.
6) The proposed stacking ensemble did not improve the prediction error of UCS on a few performance metrics, which suggests that further research on this topic may be necessary. It is also worth mentioning that, although applying a stacking ensemble is more time-consuming than using the base models alone, the time spent in practice is quite reasonable given the amount of data.
7) Applying a suitable stacking ensemble is proposed for predicting other rock properties.

Fig 9. Effects of the number of neurons in hidden layer and the solver of weight optimization in the MLP on MSE of predicting a) UCS, and b) E. https://doi.org/10.1371/journal.pone.0302944.g009

Fig 18. RSE of input variables on the a) UCS (MPa), and b) E (GPa). https://doi.org/10.1371/journal.pone.0302944.g018

Table 1 summarizes the average values of the obtained engineering properties of the tested rocks. The minimum and maximum values of UCS of the tested rocks are 11.66 and 38.49 MPa, respectively, which are moderate values based on ISRM [32]. The minimum and maximum values of E, γ_d, n_e, and v_p are 2.17 and 8.03 GPa, 22.01 and 25.78 kN/m³, 0.37 and 7.55%, and 3759.79 and 5347.06 m/s, respectively. In accordance with Anon [38], the tested rocks have very low values of E, moderate to high values of γ_d, very low to moderate values of n_e, and moderate to very high values of v_p. The minimum and maximum values of HBN are 271.81 and 975.79 kgf/mm², respectively, which are high values based on ASTM E10-18

Table 1. Engineering properties of the rocks.
order to enhance accuracy, generalization ability, and robustness. A stacking ensemble model consists of two levels: several base models are learned in the first level, and one meta-learner is trained in the second level. The meta-learner's training dataset is an altered version of the original dataset that includes synthetic features derived from the base models' predictions. The training phase of the stacking ensemble is illustrated in Fig 5. The models in the first level predict the output variable in a k-fold cross-validation manner, where one fold is held out as a validation set and the model is trained on the remaining folds. The prediction process for the first level is shown in Fig 6.
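The two-level procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions and not the authors' implementation: the paper's base models (MLP, RF, SVR, XGBoost) are replaced here by two toy single-feature regressors, the meta-learner is reduced to a one-parameter least-squares blend, and all function names and data values are hypothetical.

```python
def fit_linear(xs, ys):
    """Toy base model 1: least-squares line on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=2):
    """Toy base model 2: k-nearest-neighbour regressor on a single feature."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def out_of_fold_predictions(fit, xs, ys, n_folds=5):
    """Predict every sample with a model trained on the other folds."""
    n = len(xs)
    oof = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if i % n_folds == fold:  # validation fold, unseen by `model`
                oof[i] = model(xs[i])
    return oof


def fit_blend_weight(p1, p2, ys):
    """Meta-learner: least-squares w in y ~ w*p1 + (1 - w)*p2."""
    diff = [a - b for a, b in zip(p1, p2)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        return 0.5
    return sum(d * (y - b) for d, y, b in zip(diff, ys, p2)) / denom


# Hypothetical data for illustration only (not the measured data).
xs = [3.8, 4.0, 4.2, 4.4, 4.6, 4.8, 5.0, 5.2, 5.3, 5.1]
ys = [12.0, 15.5, 18.0, 21.0, 24.5, 28.0, 31.0, 35.5, 37.0, 33.0]

# Level 1: out-of-fold predictions become the meta-learner's features.
p_lin = out_of_fold_predictions(fit_linear, xs, ys)
p_knn = out_of_fold_predictions(fit_knn, xs, ys)

# Level 2: train the meta-learner on those synthetic features.
w = fit_blend_weight(p_lin, p_knn, ys)

# Final models: refit the base models on all training data.
lin, knn = fit_linear(xs, ys), fit_knn(xs, ys)


def stacked(x):
    return w * lin(x) + (1 - w) * knn(x)


print(stacked(4.5))
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for samples the base models have already seen, keeps it from simply rewarding the base model that overfits the most.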

Table 4. Performance results for predicting the UCS (MPa) in the testing phase.
For more clarity, the predicted and experimental values of the UCS and E obtained by the various models studied in this research, in both the training and testing phases, are displayed in Figs 14 and 15, respectively. Furthermore, the scatter plots of actual and predicted values for the four base models and the stacking ensemble model are shown in Figs 16 and 17. The correlation values of 0.83 and 0.91 confirm the suitability of the proposed method in comparison with the base models. The results show that the values predicted by the stacking ensembles are closer to the observed values than the outputs of the other models.
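As a hedged illustration of how the main error measures quoted in this paper are commonly defined, the following Python sketch computes MAE, MSE, RMSE, and R². The function name `regression_metrics` and the sample values are hypothetical, and R² is assumed to follow the 1 - SS_res/SS_tot definition, which may differ from the definition used by the authors.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # assumed definition of R^2
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical observed and predicted UCS values (MPa), illustration only.
observed = [12.0, 18.5, 25.0, 31.0, 37.5]
predicted = [13.1, 17.9, 26.2, 29.8, 36.0]
print(regression_metrics(observed, predicted))

# The MSE and RMSE pairs reported in the Conclusions agree to within
# rounding, because RMSE is the square root of MSE.
assert abs(math.sqrt(3.867) - 1.967) < 1e-3  # UCS
assert abs(math.sqrt(0.386) - 0.621) < 1e-3  # E
```

Because RMSE is expressed in the unit of the output (MPa for UCS, GPa for E), the RMSE values for the two targets are not directly comparable; R² is the scale-free measure to use when comparing them.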